Abstract: The Pareto probability distribution is widely applied in different fields such as finance,
physics, hydrology, geology and astronomy. This note deals with an application of the Pareto
distribution to astrophysics, and more precisely to the statistical analysis of the masses of stars and of
the diameters of asteroids. In particular, a comparison between the usual Pareto distribution and its
truncated version is presented. Finally, a possible physical mechanism that produces Pareto tails for the
distribution of the masses of stars is suggested.
1 Introduction



The Pareto distribution [1] [2] is a simple model for nonnegative data with a power law
probability tail. In many practical applications, it is natural to consider an upper bound
that truncates the tail [3] [4] [5]; the truncated Pareto distribution has a wide range of
applications in several fields in data analysis [3] [6].

Power law distributions are often found in astrophysics: for instance, in the range
$1\,M_\odot < M < 10\,M_\odot$, the mass of the stars (MAIN SEQUENCE V), when expressed
in terms of the solar mass $M_\odot$, scales as $\psi(M) \propto M^{-\alpha}$ with $\alpha = 2.35$,
see [7], or $\alpha = 2.3$ as suggested by a recent evaluation, see [8]. Other examples are the
intensity of non-thermal emission from supernova remnants and extra-galactic radio sources, which
scales as $\nu^{-\alpha}$ with numerical values of $\alpha$ ranging between 0.5 and 1; the observed
differential spectrum of cosmic rays, proportional to $E^{-2.75}$ in the interval $10^{10}$ eV to
$5.0 \times 10^{15}$ eV [9] [10]; and the gamma ray burst luminosity function, which also scales as a
power law [11] [12]. Of course the Pareto distribution is not the only one to exhibit a power law
tail, this behaviour being common to different distributions (e.g. the lognormal distribution);
however, Pareto distributions are especially attractive for their simple analytical form.

In this paper we present in Section 2 a comparison between the Pareto and the truncated
Pareto distributions. In Section 3 the theoretical results are applied to the distribution
of astrophysical data, namely the mass of stars and the radius of asteroids. A physical
mechanism that produces a Pareto type distribution for the masses is presented in
Section 4.



2 Preliminaries 

Let $X$ be a random variable taking values $x$ in the interval $[a, \infty)$, $a > 0$. The probability
density function (in the following pdf) named Pareto is defined by [2]

$$ f(x; a, c) = c\, a^{c} x^{-(c+1)} , \qquad (1) $$

$c > 0$, and the Pareto distribution function is $F(x; a, c) = 1 - a^{c} x^{-c}$.

An upper truncated Pareto random variable is defined in the interval $[a, b]$ and the
corresponding pdf is

$$ f_T(x; a, b, c) = \frac{c\, a^{c} x^{-(c+1)}}{1 - (a/b)^{c}} , \qquad (2) $$

[5] and the truncated Pareto distribution function is

$$ F_T(x; a, b, c) = \frac{1 - (a/x)^{c}}{1 - (a/b)^{c}} . \qquad (3) $$
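
As an illustration, Eqs. (1)-(3) can be transcribed almost literally into code. The following is a minimal Python sketch; the function names are our own choice for illustration and are not taken from any library, and values of x outside the support are simply mapped to 0 or 1.

```python
def pareto_pdf(x, a, c):
    """Pareto pdf, Eq. (1): c a^c x^-(c+1) for x >= a, zero below a."""
    if x < a:
        return 0.0
    return c * a**c * x ** (-(c + 1.0))


def pareto_cdf(x, a, c):
    """Pareto distribution function: 1 - a^c x^-c for x >= a."""
    if x < a:
        return 0.0
    return 1.0 - (a / x) ** c


def truncated_pareto_pdf(x, a, b, c):
    """Truncated Pareto pdf, Eq. (2), supported on [a, b]."""
    if x < a or x > b:
        return 0.0
    return c * a**c * x ** (-(c + 1.0)) / (1.0 - (a / b) ** c)


def truncated_pareto_cdf(x, a, b, c):
    """Truncated Pareto distribution function, Eq. (3)."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (1.0 - (a / x) ** c) / (1.0 - (a / b) ** c)
```

Inside the interval the two pdfs differ only by the constant normalisation 1 - (a/b)^c, which is what makes them hard to distinguish on small samples.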

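Moments of the truncated distribution exist for all $c > 0$. For instance, the mean of
$f_T(x; a, b, c)$ is, for $c \neq 1$ and $c = 1$, respectively,

$$ \langle x \rangle = \frac{c\,a}{c-1}\, \frac{1 - (a/b)^{c-1}}{1 - (a/b)^{c}} , \qquad \langle x \rangle = \frac{c\,a^{c}}{1 - (a/b)^{c}} \ln\frac{b}{a} . \qquad (4) $$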



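Similarly, if $c \neq 2$, the variance is given by

$$ \sigma^2 = \frac{c\,a^{2}}{c-2}\, \frac{1 - (a/b)^{c-2}}{1 - (a/b)^{c}} - \langle x \rangle^{2} , \qquad (5) $$

whereas for $c = 2$

$$ \sigma^2 = \frac{c\,a^{c}}{1 - (a/b)^{c}} \ln\frac{b}{a} - \langle x \rangle^{2} . \qquad (6) $$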
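
The mean and the variance are both instances of the raw moment of order k, obtained by integrating x^k against the pdf of Eq. (2) over [a, b]. The sketch below evaluates that closed form, including the logarithmic case that arises when k = c; the function names and the tolerance used to detect k = c are our own assumptions, introduced only for illustration.

```python
import math


def truncated_pareto_moment(k, a, b, c):
    """Raw moment <x^k> of the truncated Pareto pdf of Eq. (2)."""
    norm = 1.0 - (a / b) ** c
    if abs(c - k) < 1e-12:
        # Logarithmic case: the integrand reduces to c a^c / x.
        return c * a**c * math.log(b / a) / norm
    return c * a**k / (c - k) * (1.0 - (a / b) ** (c - k)) / norm


def truncated_pareto_mean(a, b, c):
    """Mean, i.e. the raw moment of order 1."""
    return truncated_pareto_moment(1, a, b, c)


def truncated_pareto_variance(a, b, c):
    """Variance as <x^2> - <x>^2."""
    mean = truncated_pareto_moment(1, a, b, c)
    return truncated_pareto_moment(2, a, b, c) - mean**2
```

Because the support is bounded, these moments are finite for every c > 0, in contrast with the untruncated case where the moment of order k requires c > k.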



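In general the $n$-th central moment is

$$ \int_a^b (x - \langle x \rangle)^{n} f_T(x)\,dx = (-\langle x \rangle)^{n}\, a^{-c}\, {}_2F_1\!\left(-c, -n; 1-c; \frac{a}{\langle x \rangle}\right) \left( (a^{c})^{-1} - (b^{c})^{-1} \right)^{-1} $$

$$ \qquad - (-\langle x \rangle)^{n}\, b^{-c}\, {}_2F_1\!\left(-c, -n; 1-c; \frac{b}{\langle x \rangle}\right) \left( (a^{c})^{-1} - (b^{c})^{-1} \right)^{-1} , \qquad (7) $$

where ${}_2F_1(a, b; c; z)$ is a regularized hypergeometric function, see [13] [14] [15]. An
analogous formula based on some of the properties of the incomplete beta function, see [16]
and [17], can be found in [18].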

Parameters of the truncated Pareto pdf from empirical data can be obtained via the 
maximum likelihood method; explicit formulas for maximum likelihood estimators (MLE) 
are given in [3], and for the more general case in [5], whose results we report here for
completeness. 

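Consider a random sample $X = x_1, x_2, \ldots, x_n$ and let $X_{(1)} \geq X_{(2)} \geq \cdots \geq X_{(n)}$ denote
its order statistics, so that $x_{(1)} = \max(x_1, x_2, \ldots, x_n)$ and $x_{(n)} = \min(x_1, x_2, \ldots, x_n)$.
The MLE of the parameters $a$ and $b$ are

$$ \hat{a} = X_{(n)} , \qquad \hat{b} = X_{(1)} , \qquad (8) $$

respectively, and $\hat{c}$ is the solution of the equation

$$ \frac{n}{\hat{c}} + \frac{n \left( X_{(n)}/X_{(1)} \right)^{\hat{c}} \ln\left( X_{(n)}/X_{(1)} \right)}{1 - \left( X_{(n)}/X_{(1)} \right)^{\hat{c}}} - \sum_{i=1}^{n} \left[ \ln X_{(i)} - \ln X_{(n)} \right] = 0 , \qquad (9) $$

[5].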
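
Eq. (9) has no closed-form solution, so in practice it is solved numerically. The sketch below is one possible way to do it, not the procedure of [5]: it takes the minimum and maximum of the sample as in Eq. (8) and then finds the root of the left-hand side of Eq. (9) by bisection, relying on the fact that this left-hand side decreases monotonically with c. The function name, the starting bracket and the tolerance are assumptions made for illustration.

```python
import math


def fit_truncated_pareto(sample, tol=1e-10):
    """Return (a_hat, b_hat, c_hat) following Eqs. (8) and (9)."""
    n = len(sample)
    a_hat = min(sample)  # Eq. (8): smallest order statistic
    b_hat = max(sample)  # Eq. (8): largest order statistic
    if a_hat <= 0.0 or a_hat == b_hat:
        raise ValueError("need positive data with at least two distinct values")
    log_r = math.log(a_hat / b_hat)  # ln(X_(n)/X_(1)), negative
    log_sum = sum(math.log(x / a_hat) for x in sample)

    def score(c):
        # Left-hand side of Eq. (9); expm1 keeps 1 - r^c accurate for small c.
        r_c = math.exp(c * log_r)
        return n / c + n * r_c * log_r / (-math.expm1(c * log_r)) - log_sum

    lo, hi = 1e-6, 1.0
    if score(lo) <= 0.0:
        raise ValueError("Eq. (9) has no positive root for this sample")
    while score(hi) > 0.0:  # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:  # plain bisection on the decreasing score
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return a_hat, b_hat, 0.5 * (lo + hi)
```

When the data are spread too evenly over [a, b] on a logarithmic scale, the left-hand side of Eq. (9) is negative for every positive c; the sketch signals this case with an exception instead of returning a value.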

There exists a simple test to see whether a Pareto model is appropriate [5]: the null
hypothesis $H_0 : b = \infty$ is rejected if and only if $X_{(1)} < [nC/(-\ln q)]^{1/c}$, $0 < q < 1$, where
$C = a^{c}$. The approximate p-value of this test is given by $p = \exp\{-nC X_{(1)}^{-c}\}$, and a small
value of $p$ indicates that the Pareto model is not a good fit; of course this is not enough
per se to demonstrate the goodness of a truncated Pareto distribution.
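
The test is cheap to evaluate once an estimate of c is available. The sketch below computes both the critical value and the approximate p-value from a sample; the text does not say which estimate of c enters the formula, so here c is simply passed in by the caller, and the function name and the default level q = 0.05 are assumptions for illustration.

```python
import math


def pareto_truncation_test(sample, c, q=0.05):
    """Return (p_value, reject) for the null hypothesis b = infinity."""
    n = len(sample)
    a_hat = min(sample)  # estimate of the lower bound a
    x_max = max(sample)  # largest order statistic X_(1)
    big_c = a_hat**c  # C = a^c
    critical = (n * big_c / (-math.log(q))) ** (1.0 / c)
    p_value = math.exp(-n * big_c * x_max ** (-c))
    return p_value, x_max < critical
```

The two criteria are equivalent: the largest observation falls below the critical value exactly when the p-value is smaller than q.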

Given a set of data, it is often difficult to decide whether they agree more closely with $f$ or $f_T$,
in that, in the interval $[a, b]$, the two pdfs differ only by the multiplicative factor $1 - (a/b)^{c}$, which, if
the interval $[a, b]$ is not too small, approaches 1 even for relatively small values of $c$. For
this reason, rather than $f$ and $f_T$, the distributions $P(X > x)$ and $P_T(X > x)$, often called
survival functions, are used; they are given respectively by

$$ P(X > x) = S(x) = 1 - F(x; a, c) = a^{c} x^{-c} \qquad (10) $$

and

$$ P_T(X > x) = S_T(x) = 1 - F_T(x; a, b, c) = \frac{a^{c} \left( x^{-c} - b^{-c} \right)}{1 - (a/b)^{c}} . \qquad (11) $$
Fig. 1 log-log plot of the survival function: 10000 random data (empty circles), generated
with Eq. (12), survival function of the truncated Pareto distribution (full line) and
survival function of the Pareto distribution (dotted line).

Probabilities $P$ and $P_T$ have qualitatively different trends that are better observed
in a log-log plot. In this case $P$ is represented by a straight line, whereas $P_T$
also exhibits an almost linear trend, but with a sharp drop when $x$ tends to $b$. To illustrate
this point we have generated a set of $n = 10000$ random points drawn from a truncated
Pareto distribution, via the formula

$$ X = a \left( 1 - R \left( 1 - (a/b)^{c} \right) \right)^{-1/c} , \qquad (12) $$

where $R$ is the unit rectangular variate, and we have fitted them with $S$ and $S_T$
respectively, see Figure 1.
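
The experiment can be reproduced along the following lines. This is a minimal sketch rather than the code used for Figure 1: it draws points by inverting the truncated distribution function as in Eq. (12), builds the empirical survival function, and provides the two model curves of Eqs. (10) and (11) for comparison. Function names, the random generator and the parameter values in the usage note are assumptions.

```python
import random


def sample_truncated_pareto(n, a, b, c, rng):
    """Draw n values via Eq. (12), with R uniform on [0, 1)."""
    norm = 1.0 - (a / b) ** c
    return [a * (1.0 - rng.random() * norm) ** (-1.0 / c) for _ in range(n)]


def empirical_survival(sample):
    """Pairs (x, fraction of the sample strictly larger than x), ties ignored."""
    xs = sorted(sample)
    n = len(xs)
    return [(x, (n - i - 1) / n) for i, x in enumerate(xs)]


def survival_pareto(x, a, c):
    """Eq. (10): a straight line of slope -c in a log-log plot."""
    return (a / x) ** c


def survival_truncated_pareto(x, a, b, c):
    """Eq. (11): close to Eq. (10) near a, dropping to zero at x = b."""
    return ((a / x) ** c - (a / b) ** c) / (1.0 - (a / b) ** c)
```

For example, `sample_truncated_pareto(10000, 1.0, 10.0, 1.5, random.Random(0))` gives a sample whose empirical survival function follows the truncated curve over the whole range and departs from the untruncated one only as x approaches b.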

3 Applications 

3.1 Mass of stars 

The sample of stellar masses has been obtained from the Hipparcos data as a function
of the absolute magnitude and (B-V) [19]. Results of the fitting with $P$ and $P_T$ are
presented in Table 1, where $a$, $b$, $c$ and $n$, the number of sample elements, are reported,
and in Figure 2, which shows the data with the fit.

In this case, in the range $3.44\,M_\odot > M > 0.55\,M_\odot$, see Table 1, the coefficient
$\alpha = c + 1 = 2.45$ is in agreement with modern estimates. The p-value of the
Pareto test is $p = 0.032$, indicating that the Pareto distribution is not a good
fit, as can also be seen from Figure 2.
Table 1 Coefficients of the mass distribution of the stars in the first 10 pc, for a complete
sample (MAIN SEQUENCE V). The parameter $c$ is derived through MLE and $p = 0.032$.
Fig. 2 log-log plot of the survival function of the mass distribution of the stars: data
(empty circles), survival function of the truncated Pareto pdf (full line) and survival
function of the Pareto pdf (dotted line). A complete sample (MAIN SEQUENCE V) is
considered with parameters as in Table 1.



Table 2 therefore reports the $\chi^2$ of the fit to the stars for the Pareto and the
truncated Pareto distributions, respectively.

Table 2 $\chi^2$ of different distributions when the number of bins is 5 for the stars in the
first 10 pc.
Distribution        $\chi^2$

Pareto              7.1

Truncated Pareto    5.26
3.2 Distribution of asteroid sizes

Suppose that not just the masses of stars but also those of other astrophysical objects have
a power law tail; then it is not difficult to prove that their linear dimensions, radii
or diameters, must also follow a power law. We have tested this hypothesis by considering
the diameters of different families of asteroids, namely Koronis, Eos and Themis.
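
A short derivation makes the step explicit. It is a sketch under an assumption that is not stated in the text, namely that the bodies are homogeneous spheres of common density $\rho$, so that mass and diameter are related by $M = \pi \rho D^{3}/6$; the symbols $\rho$, $m_0$ and $d_0$ are introduced here only for illustration.

```latex
P(M > m) = \left(\frac{m_0}{m}\right)^{c}, \qquad M = \frac{\pi}{6}\,\rho\,D^{3}
\;\Longrightarrow\;
P(D > d) = P\!\left(M > \frac{\pi}{6}\,\rho\,d^{3}\right)
         = \left(\frac{d_0}{d}\right)^{3c},
\qquad d_0 = \left(\frac{6\,m_0}{\pi\,\rho}\right)^{1/3}.
```

Under these assumptions the diameters again follow a Pareto law, with exponent $3c$ in place of $c$.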

The sample parameters of the families are reported in Table 3, Table 4
and Table 5, whereas Figure 3, Figure 4 and Figure 5 report the graphical display of the data
and of the fitting distributions.

Table 3 Coefficients of the diameter distribution of the Koronis family. The parameter $c$ is
derived through MLE and $p = 0.033$.
a [km]   b [km]   c      n    P(X > x)

25.1     44.3     3.77   29   truncated Pareto

25.1     ∞        5.04   29   Pareto



Table 4 Coefficients of the diameter distribution of the Eos family. The parameter $c$ is
derived through MLE and $p = 0.681$.
a [km]   b [km]   c      n    P(X > x)

30.1     110      3.80   53   truncated Pareto

30.1     ∞        3.94   53   Pareto



Table 5 Coefficients of the diameter distribution of the Themis family. The parameter $c$ is
derived through MLE and $p = 0.67$.
a [km]   b [km]   c      n    P(X > x)

35.3     249      2.5    53   truncated Pareto

35.3     ∞        2.6    53   Pareto



In the case of the Koronis family $P_T$ fits the data better than $P$, and indeed $p = 0.033$
is correspondingly small, whereas for the Eos family $P$ performs slightly better than $P_T$
($p = 0.68$), and the estimates of $c$ are very close in both cases. Finally, in the third case,
the Themis family, the two distributions are practically the same, due to the fact that the ratio
$a/b = 0.14$ is small.
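
The role of the ratio a/b can be checked directly from the tabulated parameters. The sketch below evaluates, for each family, the factor 1 - (a/b)^c that separates the two pdfs and the approximate p-value of the Pareto test, taking the largest observation equal to b and using the value of c fitted for the truncated distribution; the latter choice is our assumption, since the text does not state which estimate of c enters the test. All names are illustrative.

```python
import math

# (a [km], b [km], c of the truncated Pareto fit, n) as listed in Tables 3-5
FAMILIES = {
    "Koronis": (25.1, 44.3, 3.77, 29),
    "Eos": (30.1, 110.0, 3.80, 53),
    "Themis": (35.3, 249.0, 2.5, 53),
}


def truncation_factor(a, b, c):
    """Factor 1 - (a/b)^c relating the Pareto and truncated Pareto pdfs."""
    return 1.0 - (a / b) ** c


def pareto_p_value(a, b, c, n):
    """p = exp(-n C x_max^-c) with C = a^c and x_max = b."""
    return math.exp(-n * (a / b) ** c)


for name, (a, b, c, n) in FAMILIES.items():
    factor = truncation_factor(a, b, c)
    p = pareto_p_value(a, b, c, n)
    print(f"{name:8s} a/b = {a / b:.2f}  1-(a/b)^c = {factor:.3f}  p = {p:.3f}")
```

With these inputs the p-values come out close to those quoted in the captions of Tables 3-5, and the factor is noticeably below 1 only for the Koronis family, which is also the only one for which the two fitted values of c differ appreciably.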

4 Generating Pareto tails 

Fig. 3 ln-ln plot of the survival function of the diameter distribution of the Koronis family:
data (empty circles), survival function of the truncated Pareto pdf (full line) and survival
function of the Pareto pdf (dotted line). A complete sample is considered with parameters
as in Table 3.

Fig. 4 ln-ln plot of the survival function of the diameter distribution of the Eos family:
data (empty circles), survival function of the truncated Pareto pdf (full line) and survival
function of the Pareto pdf (dotted line). A complete sample is considered with parameters
as in Table 4.

Fig. 5 ln-ln plot of the survival function of the diameter distribution of the Themis family:
data (empty circles), survival function of the truncated Pareto pdf (full line) and survival
function of the Pareto pdf (dotted line). A complete sample is considered with parameters
as in Table 5.

As a simple example of how a distribution with a power law tail can be generated, consider the
growth of a primeval nebula via accretion, that is the process by which nebulae "capture"
mass. We start by considering a uniform pdf for the initial mass of primeval nebulae,
$m$, in a range $m_{min} < m < m_{max}$. At each interaction the $i$-th nebula has a probability
$\lambda_i$ to increase its mass $m_i$ that is given by

$$ \lambda_i = 1 - \exp(-a_k m_i) , \qquad (13) $$

where $a_k$ is a parameter of the simulation; thus more "massive" nebulae are more likely
to grow via accretion. The amount by which the primeval nebula can grow varies with
time, to take into account that the total mass available is limited,

$$ \delta m(t) = \delta m(0) \exp(-t/\tau) , \qquad (14) $$

where $\delta m(0)$ represents the maximum mass of exchange and $\tau$ the scaling time of the
phenomenon. The simulation proceeds as follows: a number $r$ is randomly chosen in the
interval $[0, 1]$ for each nebula and, if $r < \lambda_i$, the mass is increased by $\delta m(t)$, where $t$
denotes the iteration of the process. The process proceeds in parallel: at each temporal
iteration all the primeval nebulae are considered.
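
The procedure just described can be sketched in a few lines. The code below is an illustration under stated assumptions, not the program used for the results reported here: the function name, the argument names and every numerical value in the usage note are hypothetical, since the text does not give the parameters of the simulation.

```python
import math
import random


def simulate_accretion(n_nebulae, m_min, m_max, a_k, dm0, tau, n_steps, rng):
    """Grow an initially uniform population of masses via Eqs. (13) and (14)."""
    # Uniform pdf for the initial masses in [m_min, m_max].
    masses = [rng.uniform(m_min, m_max) for _ in range(n_nebulae)]
    for t in range(n_steps):
        dm = dm0 * math.exp(-t / tau)  # Eq. (14): available increment decays
        for i, m in enumerate(masses):
            growth_prob = 1.0 - math.exp(-a_k * m)  # Eq. (13)
            if rng.random() < growth_prob:
                masses[i] = m + dm  # all nebulae are visited at each step
    return masses
```

A call such as `simulate_accretion(10000, 0.1, 1.0, 0.5, 0.2, 20.0, 100, random.Random(1))` returns a list of final masses whose survival function can then be compared with Eqs. (10) and (11); because the growth probability increases with mass, the more massive nebulae pull away from the rest and populate the tail.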

Results of the simulations have been fitted with both Pareto survival distributions,
see Figure 6.

Due to a photometric effect [12] the sample of observed stars is complete only for
$m > 0.5\,M_\odot$. We have therefore set the lower boundary of the masses to $0.5\,M_\odot$,
and the resulting subset has been fitted with the Pareto and truncated Pareto survival
distributions, see Figure 6. It should be noted that the results of the simulation give $c = 1.36$,
that is $\alpha = 2.36$, in agreement with the experimental estimate.
Fig. 6 log-log plot of the survival function of the mass distribution for the primeval
nebulae when $m > 0.5\,M_\odot$ are considered. The truncated Pareto parameters are $c = 1.36$
and $p = 0.0058$.

5 Conclusions 

Results of the analysis presented here show that the truncated Pareto distribution provides
a good fit for the distribution of the masses of stars and performs better than the usual Pareto
distribution. When the asteroid diameters are considered the situation is not so clear, in that it
depends on the family one considers. It is also clear that there can be cases, such as
the Themis family, in which the ratio between the minimum and the maximum value of
the sample is so small that there is no real difference between the two distributions. Finally,
we have shown that Pareto distributions can result from a simple growth process, in
which the increase of the state variable (here mass) depends on the values taken in the
previous state; furthermore, results of the simulations agree well with the experimental
data.

As remarked earlier, Pareto distributions are not the only statistics with a power law tail; also
in astrophysics alternatives have been proposed for the statistics of asteroid diameters
(e.g. [20]). However, Pareto distributions are particularly simple; for instance, note that
they have just one free parameter, $c$, the others, $a$ and $b$, being determined by the minimum
and maximum values of the sample, respectively.

References 

[1] V. Pareto, Cours d'économie politique. Rouge, Lausanne, 1896.

[2] M. Evans, N. Hastings, B. Peacock, Statistical Distributions - third edition.
John Wiley & Sons Inc, New York, 2000.

L. Zaninetti and M. Ferraro / Versita Physics .. 2007 1-10

[3] A. Cohen, B. Whitten, Parameter Estimation in Reliability and Life Span Models,
Marcel Dekker, New York, 1988. 

[4] D. Devoto, S. Martínez, : Truncated Pareto law and oresize distribution of ground
rocks. Mathematical Geology , Vol. 30 (6) , (1998), pp. 661 - 673. 

[5] I. Aban, M. Meerschaert, A. Panorska, : Parameter estimation for the truncated 
pareto distribution , Journal of the American Statistical Association , Vol. 101 , 
(2006), pp. 270-277. 

[6] K. Rehfeldt, J. M. Boggs, L. W. Gelhar, : Field study of dispersion in a heterogeneous 
aquifer. 3: Geostatistical analysis of hydraulic conductivity. Water Resour. Res. , Vol. 
28 , (1992), pp. 3309-3324. 

[7] E. E. Salpeter, : The Luminosity Function and Stellar Evolution., ApJ , Vol. 121 , 
(1955),pp. 161-+. 

[8] P. Kroupa, : On the variation of the initial mass function, MNRAS , Vol. 322 , 
(2001), pp. 231-246. 

[9] K. R. Lang, Astrophysical formulae. Springer, New York, 1999. 

[10] R. Schlickeiser, Cosmic ray astrophysics. Springer, Berlin, 2002.

[11] E. M. Rossi, : Structure of gamma ray burst jets, Nuovo Cimento C Geophysics 
Space Physics C , Vol. 28 , (2005), pp. 387-+. 

[12] J. S. Bloom, D. A. Frail, R. Sari, : The Prompt Energy Release of Gamma-Ray 
Bursts using a Cosmological k-Correction, AJ , Vol. 121 , (2001), pp. 2879-2888. 

[13] M. Abramowitz, I. A. Stegun, Handbook of mathematical functions with formulas,
graphs, and mathematical tables, Dover, New York, 1965. 

[14] D. von Seggern, CRC Standard Curves and Surfaces, CRC, New York, 1992. 

[15] W. J. Thompson, Atlas for computing mathematical functions, Wiley-Interscience, 
New York, 1997. 

[16] I. Gradshteyn, I. Ryzhik, Table of Integrals, Series, and Products, Academic Press,
San Diego, 2000. 

[17] A. Prudnikov, O. Brychkov , Y. Marichev , Integrals and Series, Gordon and Breach 
Science Publishers, Amsterdam, 1986. 

[18] M. Masoom Ali, S. Nadarajah, : A truncated Pareto distribution. Computer
Communications , Vol. 30 , (2006), pp. 1-4. 

[19] L. Zaninetti, : The Initial Mass Function as given by the fragmentation, 
Astronomische Nachrichten , Vol. 326 , (2005), pp. 754-759. 

[20] L. Zaninetti, A. Cellino, V. Zappala, : On the fractal dimension of the families of the 
asteroids., A&A , Vol. 294 , (1995), pp. 270-273. 



